Jimmy Song

Making AI infrastructure understandable, verifiable, and participatory

I focus on how infrastructure software forms real developer ecosystems in the AI era: turning GPU governance, scheduling, inference, and agent runtime into clear methodology, practical paths, and sustainable open collaboration.

Systematic infrastructure thinking · Connecting product, developers, and community feedback · From cloud native to AI/GPU systems · Writing and tools as methodology

Explore the Value Proposition · View Focus Areas

AI Infrastructure Ecosystem View

I usually read AI infrastructure through four layers: applications and agents at the top; agentic runtime and context beneath them; inference, training, and governance in the middle; and GPU plus accelerated infrastructure underneath. Each layer needs clear resource boundaries, engineering abstractions, and developer participation paths.

Agent / AI Applications

Agentic Runtime & Context

Inference · Training · Governance

GPU & Accelerated Infrastructure
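As a rough illustration of the four-layer reading above, the stack can be written down as an ordered data structure. This is only a sketch: the `Layer` type, the `layers_below` helper, and the example resources attached to each layer are assumptions made for illustration, not a model taken from the site.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    name: str
    resources: tuple[str, ...]  # example resource boundary (illustrative assumption)


# Top-to-bottom order follows the four layers listed above.
# The resources per layer are placeholders, not an authoritative mapping.
STACK = (
    Layer("Agent / AI Applications", ("tasks",)),
    Layer("Agentic Runtime & Context", ("context", "tokens")),
    Layer("Inference · Training · Governance", ("model replicas", "quotas")),
    Layer("GPU & Accelerated Infrastructure", ("GPU", "HBM")),
)


def layers_below(name: str) -> list[str]:
    """Return the names of the layers that the given layer sits on, top to bottom."""
    names = [layer.name for layer in STACK]
    return names[names.index(name) + 1:]
```

For example, `layers_below("Agentic Runtime & Context")` lists the two layers underneath the runtime, which is one way to ask what a given layer depends on.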

SUBSTRATE-COMPATIBLE: Built on Kubernetes and cloud-native systems, with AI-specific semantic extensions
NON-DETERMINISM: AI workloads are non-deterministic by nature
AGENT-FIRST: Agents, not services, are the primary execution unit
FIRST-CLASS RESOURCES: GPU, context, and tokens become first-class resources
VALIDATION-FIRST: AI systems require continuous validation loops for runtime correctness, not just successful execution
GOVERNANCE > DEPLOY: Scheduling is the primary control plane; deployment is a secondary concern
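Two of these principles, first-class resources and governance before deployment, can be sketched in a few lines. The example below is a minimal sketch under stated assumptions: `Request`, `Quota`, `admit`, and `schedule` are hypothetical names, and the admission rule is a deliberately simple stand-in, not the API of Kubernetes or of any real scheduler.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Quota:
    gpus: int
    context_tokens: int  # context window capacity, treated as a resource
    token_budget: int    # tokens that may still be generated


@dataclass
class Request:
    agent: str           # the agent, not a service, is the unit being scheduled
    gpus: int
    context_tokens: int
    token_budget: int


def admit(req: Request, free: Quota) -> bool:
    """Governance check: every requested resource must fit the free quota."""
    return (
        req.gpus <= free.gpus
        and req.context_tokens <= free.context_tokens
        and req.token_budget <= free.token_budget
    )


def schedule(req: Request, free: Quota) -> Optional[Quota]:
    """Return the remaining quota if admitted, or None if the request is rejected.

    Deployment would only happen after this step succeeds (not modeled here).
    """
    if not admit(req, free):
        return None
    return Quota(
        free.gpus - req.gpus,
        free.context_tokens - req.context_tokens,
        free.token_budget - req.token_budget,
    )
```

The point of the sketch is the ordering: the scheduling decision accounts for GPU, context, and tokens together, and a rejected request never reaches deployment.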

→ Study the evolution from cloud-native to AI-native in 'AI Native Infrastructure'

Focus Areas and Directions

My work goes beyond explaining technology: it builds judgment, clear communication, and collaboration around the key links of infrastructure ecosystems.

AI/GPU Infrastructure

GPU scheduling, inference systems, agent runtime, resource abstractions, and platform engineering for AI workloads.

Cloud-Native System Evolution

How Kubernetes, service mesh, and platform engineering extend toward AI-era resource governance, elasticity, and multi-tenancy.

Open Source Ecosystem and DX

Documentation, tutorials, contribution paths, community communication, and developer activities that make complex infrastructure learnable and participatory.

How the Methodology Lands

A value proposition needs long-term validation. I use writing, books, landscape maps, open-source communities, and developer activities to turn abstract judgment into discussable, learnable, and collaborative material.

Direction Evolution

2017-2022

Cloud-native and Kubernetes phase: focused on container orchestration and platform fundamentals, including Kubernetes, Cloud Native Go, Cloud Native Java, Cloud Native Patterns, and Cloud Native Infrastructure.

2022-2024

Service mesh and microservices phase: deepened governance and traffic architecture through Istio, migration-focused microservice architecture, and Envoy-centered engineering practices.

2025-Now

AI-native infrastructure and AI engineering phase: built a methodology from AI engineering to infrastructure with the RAG handbook, agentic design patterns, AI handbook, GPU scheduling/virtualization, AI Native Infrastructure, and AI Infra Dao.

Working Method

Problem-First

Start from real production pain points before introducing concepts and abstractions.

Systematic Decomposition

Break complex topics into resource model, runtime, platform engineering, and governance.

Verifiable Claims

Anchor conclusions in concrete projects, measurable signals, and reproducible cases.

Long-Horizon Iteration

Use posts, books, and talks to cross-validate ideas and continuously refine the boundary of practice.

AI Native Landscape

Turn methodology into execution: a curated and continuously updated directory of open-source AI projects and tools.

Updated Regularly

Explore AI Open Source Resources

Browse a structured AI resource list for agents, AI coding tools, model infrastructure, and engineering workflows.

View AI Resource List

Latest Practical Notes

Recent engineering updates and practical notes that continue the research threads above.

Why GPU Is the Foundation of AI

Jun 17, 2026 • AI Engineering

A GPU explainer for Kubernetes veterans new to AI. Maps token, model, training, inference, Transformer, Tensor Core, HBM, and KV cache to concepts you already know.

GPU Utilization Is Breaking

Jun 17, 2026 • AI Engineering

From GPU utilization to productive GPU-hours.

Browse All

About Jimmy Song

Jimmy focuses on AI-native infrastructure and compute governance, with long-term research on GPU virtualization, heterogeneous scheduling, and system-level architecture for AI workloads. He is Open Source Ecosystem VP at Dynamia.ai, a CNCF Ambassador, and founder of the Cloud Native Community (China), and he continues to drive the shift from cloud-native to AI-native engineering.